Fluctuations around Nash Equilibria in Game Theory

We investigate the fluctuations induced by irrationality in simple games with a large number of
competing players. We show that Nash equilibria in such games are "weakly" stable: irrationality
propagates and amplifies through players' interactions so that huge fluctuations can result from a
small amount of irrationality. In the presence of multiple Nash equilibria, our statistical approach
allows one to establish which is the globally stable equilibrium. However, the characteristic times to
reach this state can be very large.

Game theory provides strategic thinking for modern economics and socio-political decisions. It has been an
active research subject in the economics community for the past half century. Extremely refined analysis is now
being performed; however, it was recently observed that such studies provide a far too idealised picture of the real
world in economics. One of the main pitfalls lies in the fact that the assumption of rationality makes game theory
deterministic. On the other hand, we know that the economic world is characterized by large fluctuations. These
fluctuations are, for obvious reasons, of great interest to people in economics. They are also becoming of great
interest in the physics community, where it has been realized that economic systems share scaling and self-organized
critical behaviors with more traditional subjects in statistical physics.

In the real world irrationality is ubiquitous. This gives us reason to use physics tools to include it in game theory.
Strikingly, we find that irrationality propagates and amplifies through players' interactions and can lead to huge
fluctuations, growing with the number of players. This shows that irrationality is indeed a "relevant parameter",
which should be included in game theory.

A game is defined as a mathematical model for optimal strategies among competing players. Typically a player has
a utility function depending on the strategies of all the players. In this paper we will limit ourselves to the so-called
complete information games, where every player is aware of all other players' strategies and benefits. Under the basic
assumption of rationality of all players, solutions in game theory are given by Nash equilibria of all players' strategies.
The nature of a Nash equilibrium differs qualitatively from that of an equilibrium state in statistical mechanics. Nash
equilibria do not result from just the maximization or minimization of some global function (such as the free
energy in statistical mechanics) but rather from the requirement that each player's utility must simultaneously be at a
local maximum with respect to his own strategy. Loosely speaking, in game theory there is not a unique Hamiltonian,
but rather each player has his own Hamiltonian to minimize. The interactions among players need not be symmetric
and their goals may be in conflict with one another. Finally, a Nash equilibrium gives an exact deterministic answer in the
sense that it includes no fluctuations. In a parallel with statistical mechanics, one could say that game theory is a "zero
temperature" theory. The aim of this paper is to include "thermal" fluctuations in game theory through the Langevin
approach. For any realistic game, this issue is of utmost importance since no player can have infinitely precise actions.
The "zero temperature" nature of game theory, resulting from complete rationality, was indeed recently questioned.
We show that the effects of fluctuations, in standard games with a large number of competing players, can be quite
dramatic and that they characterize the stability of Nash equilibria.

The simplest economically motivated model of game theory was introduced by Cournot in 1838: two firms produce
quantities $x_1$ and $x_2$, respectively, of a homogeneous product. The market-clearing price of the product depends,
through the law of supply and demand, on the total quantity $X = x_1 + x_2$ produced: $P(X) = a - bX$. The larger $X$,
the smaller $P$ is. The model assumes that the cost of producing a quantity $x_i$ is $c x_i$, with $c < a$. The firms choose
their strategies (i.e. $x_i$) with the goal of maximizing their profit (utility): $u_i = x_i[P(x_1 + x_2) - c]$. The problem is
to find $x_i$ assuming that both firms behave rationally. The best response $x_1^*(x_2)$ of firm 1 to any given strategy $x_2$
of firm 2 is obtained by maximizing $u_1(x_1, x_2)$ with respect to $x_1$ at fixed $x_2$, which gives $x_1^*(x_2) = (a - c - b x_2)/(2b)$.
Firm 2, assuming that firm 1 behaves rationally (i.e. that it will play $x_1^*(x_2)$ whatever $x_2$ is), reasons in the same
way and plays its own best response $x_2^*(x_1)$. The equilibrium is the pair of strategies that are best responses to each
other. This leads to $x_1^* = x_2^* = (a - c)/(3b)$. This solution highlights the essential point of the concept of Nash
equilibrium, which applies also to more general games.
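As a quick numerical check of the duopoly solution, the following minimal Python sketch iterates the two best-response maps until they settle. The parameter values `a`, `b`, `c` and the function names are illustrative assumptions, not part of the model description above.

```python
def best_response(x_other, a, b, c):
    """Profit-maximizing output against a rival producing x_other.

    Maximizes x * (a - b * (x + x_other) - c) over x.
    """
    return (a - c - b * x_other) / (2.0 * b)


def cournot_nash(a, b, c, iterations=200):
    """Iterate simultaneous best responses until they stop changing."""
    x1 = x2 = 0.0
    for _ in range(iterations):
        x1, x2 = best_response(x2, a, b, c), best_response(x1, a, b, c)
    return x1, x2


# Illustrative parameters (assumptions, not taken from the text).
a, b, c = 10.0, 1.0, 1.0
x1, x2 = cournot_nash(a, b, c)
closed_form = (a - c) / (3.0 * b)
```

Each iteration halves the distance to the fixed point, so both outputs settle on the closed-form value $(a - c)/(3b)$.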

In a situation with $n$ players, we consider

$$u_i = x_i V(x_1 + \ldots + x_i + \ldots + x_n). \tag{1}$$

In general, one requires that $V(X)$ be a decreasing function of $X$. This describes, apart from a law of supply and demand,
also situations where the gain of each player depends on a common resource. As $X = x_1 + \ldots + x_n$ grows, the resource
is depleted ($V(X)$ decreases). $V(X)$ can eventually turn negative for $X > X_0$: the resource has been exhausted and
production gives rise to negative benefit for all the players. Generally one has $V' \sim -n^{-1}$ ($V'$ denotes the derivative
here and below) as a consequence of the fact that each $x_i$ has an effect $1/n$ on a global quantity $V$. We shall consider
$-\infty < x_i < \infty$. A negative $x_i$ is a quantity that, instead of being produced and sold, is bought by player $i$. We shall
also discuss briefly the effects of the constraint $x_i > 0$.
Technically, the Nash equilibrium is obtained by solving

$$\left.\frac{\partial u_i(x_1, \ldots, x_n)}{\partial x_i}\right|_{x_j = x_j^* \; \forall j} = 0, \qquad \forall i. \tag{2}$$



This equation contains the maximization of the utility of player $i$ and his expectation that all other players will do
the same. For the generalized Cournot model with $n$ firms, $a - c = 1$ and $b = 1/n$ (i.e. $V = 1 - X/n$), eq. (2)
gives the Nash equilibrium

$$x_i = x_N = \frac{n}{n+1} \quad \text{and} \quad u_i = \frac{n}{(n+1)^2}. \tag{3}$$

Note that $u_i$ is of order $1/n$: the common resource is nearly exhausted, $V = 1/(n+1) \ll 1$, due to the aggressive
strategies $x_i \simeq 1$.

It is interesting to compare the above to the case where each player acts to maximize the total utility
$U(x_1, \ldots, x_n) = \sum_i u_i$. In this case $x_i$ is given by $\partial U/\partial x_i = 0$ and the result is quite different: $x_i = 1/2$ and $u_i = 1/4$.
Strikingly, the profit of each player in this case is a factor of order $n$ larger than in the previous case!
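The comparison between the two symmetric states can be checked directly. The sketch below evaluates both first-order conditions and the resulting per-player utilities for the linear case $V = 1 - X/n$; the value of `n` and the helper names are illustrative assumptions.

```python
def V(X, n):
    """Linear resource function V = 1 - X/n used in the text."""
    return 1.0 - X / n


def nash_gradient(x, n):
    """d u_i / d x_i at a symmetric profile x_j = x, with u_i = x_i V(X)."""
    return V(n * x, n) - x / n


def social_gradient(x, n):
    """d U / d x_i at a symmetric profile, with U = X V(X)."""
    return V(n * x, n) - x


n = 100  # number of players; an illustrative choice
x_nash = n / (n + 1)
x_social = 0.5
u_nash = x_nash * V(n * x_nash, n)        # n / (n + 1)^2
u_social = x_social * V(n * x_social, n)  # 1 / 4
ratio = u_social / u_nash                 # (n + 1)^2 / (4 n), of order n
```

The ratio of the two per-player utilities is $(n+1)^2/(4n)$, which grows linearly with $n$.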

This is a typical lesson of game theory: when each player acts to maximize his own utility $u_i$, the global utility is
very small. The global utility is maximized when all players have a common goal. This is very similar to the dynamics
in statistical mechanics, where all degrees of freedom evolve to optimize a Hamiltonian. The maximal utility state,
in spite of being "socially" better (everybody behaves less aggressively and receives a better payoff), is unfortunately
never achieved since incentives to cheat are large. This fact will emerge clearly from the analysis of fluctuations.

In the Nash equilibrium, instead, everybody is more aggressive (larger $x_i$) and the per-player benefit is much more
meager. The crucial feature which makes this state more relevant than the social one is its stability: the Nash
equilibrium is stable because no player has an incentive to cheat, since an over-aggressive move ($x_i > x_N$) would
hurt the player himself.

It is important to note that the Nash equilibrium can be reached dynamically, for example in a repeated game
where the players adjust their strategies according to the gradient: $\partial_t x_i = \partial u_i/\partial x_i$. This observation suggests that
a "finite temperature" can be included in the system by considering the Langevin-like equation (in suitable units of
time):

$$\partial_t x_i = \frac{\partial u_i}{\partial x_i} + \eta_i, \tag{4}$$

where $\eta_i(t)$ is Gaussian noise with $\langle \eta_i(t) \rangle = 0$ and $\langle \eta_i(t)\eta_j(t') \rangle = D\delta_{ij}\delta(t - t')$. $D$, in the statistical mechanics analogy,
plays the role of a finite temperature.
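One way eq. (4) might be explored numerically is with a simple Euler-Maruyama scheme. The sketch below does this for the linear case $V = 1 - X/n$; the step size, run length, parameter values and the discretization of the noise (independent Gaussian increments of variance $D\,dt$ per step) are assumptions made for illustration, not choices taken from the text.

```python
import math
import random


def simulate_langevin(n, D, dt, steps, seed=0):
    """Euler-Maruyama integration of dx_i/dt = du_i/dx_i + eta_i for V = 1 - X/n.

    Assumption: each step adds independent Gaussian noise of variance D * dt,
    matching <eta_i(t) eta_j(t')> = D delta_ij delta(t - t').
    Returns the final configuration and the time average of the mean strategy.
    """
    rng = random.Random(seed)
    x = [n / (n + 1.0)] * n  # start at the Nash equilibrium
    noise_scale = math.sqrt(D * dt)
    mean_sum = 0.0
    for _ in range(steps):
        V = 1.0 - sum(x) / n
        # drift_i = V + x_i V' with V' = -1/n
        x = [xi + (V - xi / n) * dt + noise_scale * rng.gauss(0.0, 1.0) for xi in x]
        mean_sum += sum(x) / n
    return x, mean_sum / steps


n, D = 10, 0.01  # illustrative values
x_final, time_averaged_mean = simulate_langevin(n, D, dt=0.01, steps=20000)
```

The time-averaged mean strategy stays close to $x_N = n/(n+1)$, while the individual $x_i$ wander much more widely around it.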

If $u_i$ is given by eq. (1), it is possible to find the stationary state distribution $P(\vec{x})$. Indeed, since $V$ depends only
on $X = \sum_i x_i$, it is convenient to perform an orthonormal transformation in the space spanned by $\vec{x} = (x_1, \ldots, x_n)$
into $\vec{y} = (y_1, \ldots, y_n)$ in such a way that $y_n = X/\sqrt{n}$. The Gram-Schmidt method then gives
$y_k = \left(\sum_{i \le k} x_i - k x_{k+1}\right)/\sqrt{k(k + 1)}$ for $k < n$.
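The change of variables can be written out as an explicit matrix and checked for orthonormality. The sketch below builds it with plain Python lists; the function names, the size `n` and the test vector are illustrative assumptions.

```python
import math


def transformation_matrix(n):
    """Rows are the coefficients of y_1..y_n in terms of x_1..x_n.

    Row k (1 <= k < n): (1, ..., 1, -k, 0, ..., 0) / sqrt(k (k + 1)),
    with k ones followed by -k in position k + 1.
    Row n: (1, ..., 1) / sqrt(n), so that y_n = X / sqrt(n).
    """
    rows = []
    for k in range(1, n):
        norm = math.sqrt(k * (k + 1))
        rows.append([1.0 / norm] * k + [-k / norm] + [0.0] * (n - k - 1))
    rows.append([1.0 / math.sqrt(n)] * n)
    return rows


def dot(u, v):
    """Euclidean scalar product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


n = 6  # illustrative size
O = transformation_matrix(n)
gram = [[dot(O[i], O[j]) for j in range(n)] for i in range(n)]  # should be the identity
x = [0.3, -1.2, 2.0, 0.7, 1.1, -0.4]  # arbitrary test vector
y = [dot(row, x) for row in O]
```

Because the rows are orthonormal, the sum of squares is preserved, $\sum_i x_i^2 = \sum_k y_k^2$.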

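In the new variables, the dynamics reads

$$\partial_t y_k = V' y_k + \tilde\eta_k, \qquad 1 \le k < n, \tag{5}$$
$$\partial_t y_n = \sqrt{n}\,V + V' y_n + \tilde\eta_n, \tag{6}$$

and, by orthonormality, $\langle \tilde\eta_i(t)\tilde\eta_j(t') \rangle = D\delta_{i,j}\delta(t - t')$.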

This transformation has the virtue of displaying the statistical dependence of the variables in a natural way. Since
$V$ and $V'$ depend on $y_n$ only, $y_n$ has a dynamics which is independent of the $y_k$, whereas each $y_k$ is coupled to $y_n$.
Therefore the stationary distribution can, in general, be expressed as $P(\vec{y}) = P(y_n)\prod_{k<n} P(y_k|y_n)$, where $P(y_k|y_n)$
is the distribution of $y_k$ conditional on $y_n$. Eq. (6) describes a "particle" in a potential with thermal fluctuations and
can be solved using standard techniques. The same holds for eq. (5), where $y_n$ appears as a parameter. We find
$P(\vec{y}) \propto \exp[-H/D]$ and

$$H = -\int^{y_n}\! dy\,\left[\sqrt{n}\,V + y\,V'\right] - \frac{V'}{2}\sum_{k<n} y_k^2 - \frac{n-1}{2}\,D\ln|V'|. \tag{7}$$

This form of the stationary distribution is reminiscent of an equilibrium system with Hamiltonian $H$. It would however
be misleading to identify $-H$ with some measure of the utility $U$. This stationary state has a completely dynamic
origin.

The equilibrium distribution, for $D$ small, can be expanded around its maximum. The minimum of $H(\vec{y})$ (the maximum of $P$)
is attained at $y_k^* = \delta_{k,n}\sqrt{n}\,x_N$, in agreement with eq. (3). The Gaussian fluctuations around the Nash equilibrium are found in
the standard way: expand $H(\vec{y})$ up to second order in $\delta y_k = y_k - y_k^*$. The inverse of the matrix of the quadratic
form yields the fluctuations $\langle \delta y_k \delta y_j \rangle$. In view of eq. (5), one finds $\langle \delta y_k \delta y_j \rangle = \delta_{kj} D/|V'|$ for $k < n$. Note that since
$|V'| \sim n^{-1}$, these fluctuations are of order $n$. Because of these huge fluctuations, we shall call the $y_k$ "soft modes". The
fluctuations of $y_n$ instead turn out to be of order 1. One can infer the fluctuations of the $x_i$'s by using the identity

$$\sum_{i=1}^{n} x_i^2 = \sum_{k=1}^{n} y_k^2 \tag{8}$$

and assuming $\langle \delta x_i \delta x_j \rangle = (A - C)\delta_{ij} + C$. We discuss here only the case $V = 1 - X/n$, which allows more compact
expressions. The same features discussed below apply to any $V(X)$ such that $V' \sim -n^{-1}$. Observing that $\langle \delta y_n^2 \rangle =
A + (n - 1)C$, and using eq. (8), one finds
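For the linear case the algebra for $A$ and $C$ can be carried through explicitly. The sketch below is a reconstruction under two assumptions: the stated result $\langle \delta y_k^2 \rangle = D/|V'| = nD$ for $k < n$, and the same convention (variance equal to $D$ divided by the relaxation rate) applied to $y_n$, whose relaxation rate in eq. (6) is $(n+1)/n$ when $V = 1 - X/n$.

```latex
\begin{align*}
\langle \delta y_k^2 \rangle &= nD \quad (k<n), \qquad
\langle \delta y_n^2 \rangle = \frac{nD}{n+1}, \\
nA &= \sum_{i=1}^{n} \langle \delta x_i^2 \rangle
    = \sum_{k=1}^{n} \langle \delta y_k^2 \rangle
    = (n-1)\,nD + \frac{nD}{n+1}
 \;\Rightarrow\; A = \frac{n^2 D}{n+1}, \\
A + (n-1)\,C &= \frac{nD}{n+1}
 \;\Rightarrow\; C = -\frac{nD}{n+1}.
\end{align*}
```

Under these assumptions $A \simeq nD$ and $C \simeq -D$ for large $n$, so that $\sqrt{A}/x_N = \sqrt{(n+1)D} \simeq \sqrt{nD}$, and the correlation between different players is negative.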



The main message of eq. (9) is that fluctuations around the Nash equilibrium are very strong: the relative
fluctuation of $x_i$, given that the average of $x_i$ is close to one, is proportional to $\sqrt{nD}$. This depends on the fact that
$|V'| \sim n^{-1}$, which is a very general feature in large games of the form (1). The variable $x_i$ fluctuates by the same order
of magnitude as the sum of $x_i$ over all $i = 1, \ldots, n$. This is possible because of the negative correlation among the
variables: a fluctuation of one of the variables is compensated by opposite fluctuations of the others.

Let us see what happens to the total utility. This is best seen in the variables $y_k$, because $U = \sqrt{n}\,y_n V(\sqrt{n}\,y_n)$
depends only on $y_n$. Therefore, expanding $U$ up to second order around $y_n^*$ and taking the average, we find

$$\langle U \rangle \simeq \left(\frac{n}{n+1}\right)^2 - \frac{n}{n+1}\,D, \tag{10}$$

where, again, we assumed $V(X) = 1 - X/n$. As can be easily seen, the fluctuations decrease the utility by a term
$\delta U \simeq -D$ and they can have a dramatic effect: if $D > D_c \simeq 1$ the average utility becomes negative!

With respect to the dynamics, it is easy to check that the correlation function of the "soft" modes $y_k$, in the steady
state for the linear $V(X)$, is

$$\langle y_k(t)\,y_k(t + \tau) \rangle \simeq nD\exp(-\tau/n), \tag{11}$$

which implies very long correlation times in the stationary state. This applies to the correlations of $x_i$ as well.

The features discussed thus far hold the same if the constraint $x_i > 0$ is imposed. The fluctuations around the Nash
equilibrium, eq. (3), at the level of the Gaussian approximation are still given by the above results. These characterize
correctly the neighborhood of the Nash equilibrium. The corrections to the Gaussian fluctuations are negligible when
$\delta x_i$ is much less than $x_N$, which occurs for $D \ll 1/n$. Numerical simulations show that, even for larger $D$, the same
qualitative features (large fluctuations and eventually $\langle U \rangle < 0$) hold also in the presence of the constraint $x_i > 0$.

It is instructive to study the "social" equilibrium in the same way. Now each player attempts to maximize the total
utility $U$, and the Langevin equation is $\partial_t x_i = \partial U/\partial x_i + \eta_i$. In the variables $\vec{y}$ we find $\partial_t y_n = \sqrt{n}\,V + n y_n V' + \tilde\eta_n$
and $\partial_t y_k = \tilde\eta_k$ for $k < n$. Note that the $y_k$ now behave as random walks. The distribution of $y_k$ at long times is
$P(y_k, t) \propto \exp[-y_k^2/(2Dt)]$ and the correlations are $\langle y_k^2 \rangle = Dt$ for $k < n$ and $\langle \delta y_n^2 \rangle \sim D$. This implies unbounded
fluctuations of $x_i$ (i.e. $\langle \delta x_i^2 \rangle \sim Dt$) and a negative correlation $\langle \delta x_i \delta x_j \rangle / \langle \delta x_i^2 \rangle \to -1/(n - 1)$ such that the fluctuations
of the sum $X$ are finite. This implies that the average utility remains finite. The absence of a stationary distribution
in the "social equilibrium" reflects its instability.
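The contrast between the two states can be seen from the linearized deterministic drift alone. The sketch below applies the Jacobian of each dynamics, for $V = 1 - X/n$, to a vector along the sum $X$ and to a zero-sum ("soft") direction, and reads off the relaxation rates; the function names and the value of `n` are illustrative assumptions.

```python
def apply_jacobian(v, n, mode):
    """Apply the Jacobian of the deterministic drift (V = 1 - X/n) to v.

    mode == "nash":   drift_i = 1 - X/n - x_i/n  ->  (J v)_i = -(sum(v) + v_i) / n
    mode == "social": drift_i = 1 - 2 X/n        ->  (J v)_i = -2 sum(v) / n
    """
    s = sum(v)
    if mode == "nash":
        return [-(s + vi) / n for vi in v]
    if mode == "social":
        return [-2.0 * s / n for _ in v]
    raise ValueError("mode must be 'nash' or 'social'")


def relaxation_rate(v, n, mode):
    """Return lam such that J v = -lam v, assuming v is an eigenvector of J."""
    Jv = apply_jacobian(v, n, mode)
    i = max(range(len(v)), key=lambda j: abs(v[j]))
    return -Jv[i] / v[i]


n = 50  # illustrative number of players
uniform = [1.0] * n                   # direction of the sum X (the y_n mode)
soft = [1.0, -1.0] + [0.0] * (n - 2)  # a zero-sum direction (the y_1 mode)
rates = {
    (mode, name): relaxation_rate(vec, n, mode)
    for mode in ("nash", "social")
    for name, vec in (("sum", uniform), ("soft", soft))
}
```

In the Nash dynamics the soft modes relax slowly, at rate $1/n$, whereas in the social dynamics their rate is exactly zero, which is why they diffuse freely.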

The results generalize with little qualitative change when one considers a more general correlation among the $\eta_i$ or a
mixed "social"-egoistic model. These and other generalizations, as well as more detailed calculations, will be presented
in a forthcoming publication.

It is generally recognized in economics that in realistic situations the relation between utility and wealth (net
profit) is not linear [10]. One source of non-linearity, for example, is inefficiency in capital management. This can
be tolerated by the rich, whereas it is very dangerous for the poor. Most studies [11] assume empirically a quadratic
relation [12]. This, assuming $x_i V$ as a measure of the wealth of player $i$, leads to

$$u_i = x_i V(X)\left[1 - r x_i V(X)\right]. \tag{12}$$



Let us define $\bar{x} = X/n$ and assume that $V = 1 - \bar{x}$ (with little loss of generality, since non-linearities in $V$ do
not change the results qualitatively). The interesting feature of this model is that if $r > 2$ a new Nash equilibrium
appears. Indeed

$$\left.\frac{\partial u_i}{\partial x_i}\right|_{x_j = \bar{x}\;\forall j} = \left[1 - \frac{\bar{x}}{x_N}\right]\left[1 - 2r\bar{x}(1 - \bar{x})\right] = 0, \tag{13}$$

for $r > 2$, has three solutions: $\bar{x} = x_N$, the usual Nash equilibrium of eq. (3); $\bar{x} = x_r = (r - \sqrt{r(r - 2)})/2r$, which is a
new Nash equilibrium; and $\bar{x} = x_+ = (r + \sqrt{r(r - 2)})/2r$, which is an unstable equilibrium (i.e. a minimum of the
utility). Provided $2 < r < n/2$, one has $x_r < x_+ < x_N$. The utility in the new equilibrium is $u_i(x_j = x_r\ \forall j) = 1/(4r)$,
which is positive and finite, as compared to that at $x_N$, which is $O(1/n)$. In the presence of two equilibria a player will
choose one or the other according to what he judges other players will do.
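The three symmetric solutions and the utility at the new equilibrium can be verified numerically. The sketch below evaluates the left-hand side of eq. (13) at each candidate; the values of `r` and `n` and the function names are illustrative assumptions.

```python
import math


def symmetric_gradient(x, r, n):
    """Left-hand side of eq. (13): d u_i / d x_i at x_j = x for all j."""
    x_nash = n / (n + 1.0)
    return (1.0 - x / x_nash) * (1.0 - 2.0 * r * x * (1.0 - x))


def utility(x, r):
    """u_i = w (1 - r w) with wealth w = x (1 - x) at a symmetric profile."""
    w = x * (1.0 - x)
    return w * (1.0 - r * w)


r, n = 3.0, 100  # illustrative values satisfying 2 < r < n/2
root = math.sqrt(r * (r - 2.0))
x_r = (r - root) / (2.0 * r)     # new Nash equilibrium
x_plus = (r + root) / (2.0 * r)  # unstable equilibrium
x_nash = n / (n + 1.0)           # usual Nash equilibrium
u_new = utility(x_r, r)          # should equal 1 / (4 r)
```

At $x_r$ the wealth is $x_r(1 - x_r) = 1/(2r)$, which gives the utility $1/(4r)$ independently of $n$.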

Situations with more than one stable solution are frequent and of great interest in economics. In particular, one
would like to know under what conditions a state is selected. The framework of Langevin dynamics, eq. (4), is particularly
appealing. Indeed, as we shall see, it allows one to understand which state is globally stable and how long a transition from
the other state into it will take.

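From eq. (4) we can derive the equation for $\bar{x}$:

$$\partial_t \bar{x} = \left[1 - \frac{\bar{x}}{x_N}\right]\left[1 - 2r\bar{x}(1 - \bar{x})\right] + \frac{2r(1 - \bar{x})}{n^2}\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right] + \bar\eta. \tag{14}$$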



Here $\bar\eta$ is the average of the $\eta_i$, i.e. it is a white noise with equal-time correlation $D/n$. In spite of the fact that all the $x_i$
appear in eq. (14), it is still useful to use the variables $y_k$. Indeed one can use the identity (8) and average over the
degrees of freedom $y_k$ in eq. (14). This amounts to replacing the term in brackets by $(n - 1)\langle y_k^2|\bar{x} \rangle$, where $\langle y_k^2|\bar{x} \rangle$ is the
average of $y_k^2$ conditional on a fixed $\bar{x}$. In order to close the equations, we assume that in the Langevin equation for
$y_k$,

$$\partial_t y_k = -\left[2r(1 - \bar{x})^2 + \frac{1}{n}\right] y_k + \frac{2r(1 - \bar{x})}{n\sqrt{k(k + 1)}}\left[\sum_{i \le k} x_i^2 - k x_{k+1}^2\right] + \tilde\eta_k,$$



we can neglect the second term on the right-hand side. This can be justified by the expectation that this term is
negligible for $n \gg 1$ if $x_i^2 \sim \bar{x}^2$. The equation for $y_k$ then simplifies considerably and one finds that, in the steady
state,



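$$\langle y_k^2|\bar{x} \rangle = \frac{Dn}{4nr(1 - \bar{x})^2 + 2}. \tag{15}$$

This, in eq. (14), gives (to leading order in $n$)

$$\partial_t \bar{x} = \left[1 - \frac{\bar{x}}{x_N}\right]\left[1 - 2r\bar{x}(1 - \bar{x})\right] + \frac{2rD(1 - \bar{x})}{4nr(1 - \bar{x})^2 + 2} + \bar\eta, \tag{16}$$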



which can be cast in the form $\partial_t \bar{x} = -dH/d\bar{x} + \bar\eta$, where $-H(\bar{x})$ is the integral of the deterministic part of eq.
(16). Since $\langle \bar\eta(t)\bar\eta(t') \rangle = (D/n)\delta(t - t')$, the stationary solution is $P(\bar{x}) \propto \exp[-nH(\bar{x})/D]$. Here $H$ plays the
role of a free energy. Indeed it has the form $H = E - TS$, where $T = D/n$ is the analogue of the temperature and
$S$ is the entropy. The entropy enters from the fluctuations of the degrees of freedom $y_k$, which have been
self-consistently retained in the equation for $\bar{x}$. Let us discuss, from this point of view, the statistics of $\bar{x}$ in the
steady state. Fixing $E(x_+) = 0$, the energies in the two equilibrium states are, to leading order in $n$,

$$E(x_N) = -\frac{2r - 3 - 4r(r - 2)x_r}{24r} + O(n^{-1}), \qquad E(x_r) = -\frac{(r - 2)(1 - 2x_r)}{6} + O(n^{-1}).$$

Therefore $E(x_N) < E(x_r)$ in the interval $2 < r < 9/4$, whereas $E(x_N) > E(x_r)$ for $r > 9/4$. It is important
to stress that this energy cannot be interpreted as $-U$. Indeed, note that the minimum energy is at $x_N$ for
$2 < r < 9/4$, whereas the maximum utility $U$ is always at $x_r$.
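The energy comparison can be checked by integrating the deterministic drift directly. The sketch below takes the $n \to \infty$ limit (so that $x_N \to 1$ and the drift is $(1 - \bar{x})[1 - 2r\bar{x}(1 - \bar{x})]$), integrates this polynomial in closed form, and compares the result with the expressions for $E(x_N)$ and $E(x_r)$; the function names and the sample values of `r` are illustrative assumptions.

```python
import math


def drift_antiderivative(x, r):
    """Antiderivative of (1 - x)(1 - 2 r x (1 - x)), the n -> infinity drift."""
    return x - (2.0 * r + 1.0) * x**2 / 2.0 + 4.0 * r * x**3 / 3.0 - r * x**4 / 2.0


def energies(r):
    """Return (E(x_N), E(x_r)) with E(x_+) = 0, to leading order in n (x_N -> 1)."""
    root = math.sqrt(r * (r - 2.0))
    x_r = (r - root) / (2.0 * r)
    x_plus = (r + root) / (2.0 * r)
    reference = drift_antiderivative(x_plus, r)
    # E(x) = -(F(x) - F(x_+)), since the drift is -dE/dx.
    e_nash = reference - drift_antiderivative(1.0, r)
    e_new = reference - drift_antiderivative(x_r, r)
    return e_nash, e_new


def closed_forms(r):
    """The closed-form leading-order energies quoted in the text."""
    x_r = (r - math.sqrt(r * (r - 2.0))) / (2.0 * r)
    e_nash = -(2.0 * r - 3.0 - 4.0 * r * (r - 2.0) * x_r) / (24.0 * r)
    e_new = -(r - 2.0) * (1.0 - 2.0 * x_r) / 6.0
    return e_nash, e_new


sample_r = (2.1, 2.25, 3.0)  # below, at, and above the crossing r = 9/4
results = {r: energies(r) for r in sample_r}
```

The two energies coincide at $r = 9/4$, where $x_r = 1/3$; below it $x_N$ has the lower energy, above it $x_r$ does.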

Energy alone suggests therefore that the system will fall into the lowest energy minimum for $t \to \infty$, and
this, in the limit $n \to \infty$, is $x_N$ for $r < 9/4$ and $x_r$ for $r > 9/4$. This conclusion holds to leading order in $n$ even
if one considers also the entropy. The reason is that, for $n \to \infty$, one is considering very small temperatures
$T = D/n$. Direct calculation shows that $S(x_N) = (\log n)/4 + O(n^{-2})$ while $S(x_r) = [\log(2r x_r - 1)]/4 + O(n^{-1})$. The
entropy in $x_N$ is considerably larger than that in $x_r$. Indeed the fluctuations of $y_k$ are of order $\sqrt{n}$ in $x_N$, whereas
in $x_r$ they are finite [see eq. (15)]. In other words, since $\langle \delta x_i^2 \rangle \simeq \langle y_k^2|\bar{x} \rangle$, the set $\{x_i\}$ is much more widely spread
in the $x_N$ minimum than in the $x_r$ one. This is an effect which is correctly accounted for by the entropy above.

Even though we can identify a globally stable equilibrium ($x_N$ for $r < 9/4$ and $x_r$ otherwise) which will ultimately
attract the system under the Langevin dynamics, it is important to stress that the other, metastable
equilibrium can be stable over times which are exponentially large in $n$. Indeed the energy barrier between the
two minima is finite, but the temperature $T = D/n$ is very small. If $r > 9/4$ and initially the system is in the $x_N$
equilibrium, it will not visit the state $x_r$ before a time of the order of $\exp[n|E(x_N)|/D]$, where
$|E(x_N)| = E(x_+) - E(x_N)$ is the height of the barrier. This time can be infinite
for all practical purposes. In other words, the system is very sensitive to initial conditions.

We have presented a general approach to extend game theory to include fluctuations. We used the Langevin formulation,
which provides a natural bridge between game theory and statistical mechanics. The essential difference
is that individual utility functions replace a global Hamiltonian. Fluctuations describe in a natural way the
stability of the various equilibria. We find that the Nash equilibrium of simple games with competition is
stable against thermal fluctuations, even though the amplitude of fluctuations is very large. On the contrary, the
"socially ideal" state is marginally unstable due to the presence of "soft modes". The approach also allows one to
study situations with more than one Nash equilibrium and identifies the globally stable one, as well as the criteria
under which a state is reached by the dynamics.

This work was supported by the Swiss National Foundation under grant 20-40672.94/1.